A chromosomal-scale genome assembly of modern cultivated hybrid sugarcane provides insights into origination and evolution

Sugarcane is a vital crop with significant economic and industrial value. However, the cultivated sugarcane’s ultra-complex genome still needs to be resolved due to its high ploidy and extensive recombination between the two subgenomes. Here, we generate a chromosomal-scale, haplotype-resolved genome assembly for a hybrid sugarcane cultivar ZZ1. This assembly contains 10.4 Gb genomic sequences and 68,509 annotated genes with defined alleles in two sub-genomes distributed in 99 original and 15 recombined chromosomes. RNA-seq data analysis shows that sugar accumulation-associated gene families have been primarily expanded from the ZZSO subgenome. However, genes responding to pokkah boeng disease susceptibility have been derived dominantly from the ZZSS subgenome. The region harboring the possible smut resistance genes has expanded significantly. Among them, the expansion of WAK and FLS2 families is proposed to have occurred during the breeding of ZZ1. Our findings provide insights into the complex genome of hybrid sugarcane cultivars and pave the way for future genomics and molecular breeding studies in sugarcane.

As the first report of a phased chromosome level assembly of a complex hybrid sugarcane genome, this is an important contribution to the sugarcane community. The authors are to be congratulated for overcoming the technical challenges in assembling this complex polyploid genome.

Two specific questions and comments:
Lines 96/97: The text mentions 'artificial sequences' and 'artificial k-mers'. Does this refer to chimeric sequences generated during sequencing, i.e. artifacts? Is 'artificial' the best word to use here? Or artifactual sequences?
Lines 222/223: It seems that the inference here is that because the variety ZZ1 has more sugar transport gene members than the reference genomes of S. spontaneum (Clone AP85-441) and S. officinarum (LA Purple), gene expansion occurred during breeding. It is known, however, that at least 19 different S. officinarum clones and a range of S. spontaneum, S. barberi and S. sinense clones contribute to the genealogy of modern hybrids. It is likely that the additional gene members found in ZZ1 are derived from other ancestral clones of SO and SS, and not due to gene expansion during the several generations of breeding that led to variety ZZ1.
One general comment relating to genome nomenclature and its implications for the sugarcane genomics community.
It has been recognized for some time that the origins of modern sugarcane hybrids are S. spontaneum with x=8 and S. officinarum with x=10, and that structural rearrangements between the genomes of the two species will lead to issues in chromosome nomenclature. Without standardization, this will lead to confusion in the literature. For example, in this submission, the nomenclature of LA Purple was used (based on sorghum chromosome nomenclature), with the result that Chr5 and Chr8 are 'missing' for SS, while Chr9 and Chr10 are 'present' for SS, whereas the published genome for SS (Zhang et al, 2018) names SS chromosomes 1 to 8; i.e. there is a lack of correspondence between the chromosome nomenclature of Zhang 2018 and this submission.
The proposed solution to this problem was published by Garsmeur et al 2018, by de By adopting this nomenclature, future SS genomes can be named 1 to 8 and future hybrid genomes named 1 to 10, without confusion. It is strongly suggested that the authors consider adopting the nomenclature proposed by Garsmeur 2018 in order to facilitate and standardize sugarcane chromosome nomenclature.
Reviewer #2: Remarks to the Author:
Overall evaluation
==================
This manuscript presents a new exciting genome assembly and analysis of the species Saccharum officinarum. This species has a quite complex genome and previous assemblies are incomplete. The latest advances in sequencing technologies have allowed the authors to produce an excellent genome assembly. Nevertheless, there are several points that the authors should resolve: 1-The lack of important details in the material and methods, which makes a full assessment difficult; 2-A deeper discussion on the results with a better contextualization and the inclusion of many articles about the complex sugarcane genome; 3-An extended analysis of the genome structure and evolution of this species; 4-Revision of the format of the manuscript including the coherence between text and tables. So overall, it is a manuscript worth reading. For sure, something publishable from which the community will benefit once the major concerns have been resolved.
• The cytogenetic characterization of the sugarcane accession ZZ1 described a genome of approx. 9 Gb and 114 chromosomes of which 68 chromosomes derived from S. officinarum, 31 from S. spontaneum, and 15 representing recombination between sub-genomes.
• The authors did not find sub-genome dominance for gene expression, although they found that 7.1% of the gene expression was biased to S. officinarum, and 3.0% to S. spontaneum.
• The pathogen response to Pokkah boeng disease is driven mainly by the S. officinarum sub-genome rather than the S. spontaneum sub-genome according to a transcriptomic analysis. Nevertheless, the non-overlapping function of the genes of each sub-genome indicates a possible cooperative response between them.

B. Originality and significance
This manuscript presents a new exciting assembly of the complex sugarcane genome. There have been several genome assemblies of Saccharum spontaneum (Zhang et al., 2018, Zhang et al., 2021) where the origin of this species is discussed. There are also some fragmented genome assemblies for S. officinarum (Garsmeur et al., 2018, Souza et al., 2019), but the incompleteness of the assemblies limited the analysis.
C. Data & methodology: validity of approach, quality of data, quality of presentation.
- The methodology used looks adequate, although many details are missing from the Material & Methods section, which makes a full assessment difficult. For example, the HMW DNA extraction method has not been described in detail (there is only a link to 10X Genomics), so there is a gap in the assessment of how the high-quality SMRTbell libraries (30-50 kb) were produced. The RNA extraction method has not been specified either, nor have any of the versions of the programs used. The Hi-C map is not very helpful to identify the possible pseudo-chromosomes. I recommend having in the supplementary a figure with the chromosomes with assigned bins and, because there are many of them, splitting the figure in 6-7 so the readers could inspect the results in more detail. In this sense, the authors should also report the number of contacts produced by the Hi-C map. The assessment of the quality of the assembly is partial. The authors have used Merqury but not reported the results about consensus quality and completeness (at least not in the result section). There is no assessment of the completeness of the LTR elements (e.g., using LTR_retriever). The contamination screening was performed only by Blast against bacterial genomes, but not, for example, fungi. The use of Blobtools is recommended in this case. There are some parameters missing in the gene model annotation that could be useful to assess the quality, such as the number of single-exon genes, length of the gene models, number of UTRs annotated, number of genes supported by RNA-Seq data… There is no description of the RNA-Seq analysis, including the number of replicates per sample. The quality of the genome assembly should be described using the standards of the Earth Biogenome Project (https://www.earthbiogenome.org/assembly-standards).
F. Suggested improvements: experiments, data for possible revision. - The manuscript lacks important details in the material and method section.
The discussion also should be improved. There is not an extensive discussion of the polyploid nature of the genome in the context of origin (e.g., no dating, or comparison with the results of the Saccharum spontaneum articles (Zhang et al., 2018, Zhang et al., 2021)). The discussion is more a summary than a contextualization in which different results are compared with previously published results, leading to new hypotheses and insights.
The authors did an excellent characterization of the origin of the different sub-genomes. Nevertheless, I think that there are several aspects of the history of this complex genome that could be interesting to explore:
• Tracing of the maternal inheritance of the organelle genomes through the different rounds of hybridization and polyploidization. The authors did not mention anything about the chloroplast and mitochondria origin of the ZZ1 accession. Suppl. figure 1 states ROC1 as the maternal donor, but it would be interesting to know if they derived from S. robustum or S. officinarum.
• Dating of speciation and hybridization events. The authors did estimate the divergence between the different sub-genomes using the synonymous substitution (Ks) of the ortholog pairs of SO-ZZSO, SS-ZZSS, ZZSS-ZZSO, and SO-SS. It would add value to the results and the discussion to see some age estimation associated with these Ks values. The authors could even use the Miscanthus genomes (Zhang et al. 2021) to date the possible divergence dates between the different sub-genomes (Miscanthus-Saccharum split dated as 7.9 MYA according to https://timetree.org/).
• Homologous exchange transposition (HET) characterization. There is not an extensive analysis of the regions of the genome that present HET. It could be interesting to have a deeper look at this part with some questions (for example, are the HET regions rich in TEs favoring the non-reciprocal exchanges?).
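As a rough illustration of the age estimation suggested in the dating point above, a Ks value for an ortholog pair is commonly converted to a divergence time with T = Ks / (2r), where r is the synonymous substitution rate per site per year. The sketch below is only a minimal example: the function name is hypothetical, and the default rate of 6.5e-9 is an assumption (a value often used for grasses), not a figure taken from the manuscript.

```python
def divergence_time_mya(ks, subs_per_site_per_year=6.5e-9):
    """Convert a Ks value into an approximate divergence time in million years.

    Uses T = Ks / (2 * r). The default rate r is an assumed value often
    used for grasses; it is not taken from the manuscript under review.
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if subs_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * subs_per_site_per_year)
    return years / 1e6  # years -> million years


if __name__ == "__main__":
    # Hypothetical Ks peak values, for illustration only.
    for label, ks in [("pair A", 0.013), ("pair B", 0.026)]:
        print(label, round(divergence_time_mya(ks), 2), "Mya")
```

Calibrating r against a dated split, such as the Miscanthus-Saccharum divergence mentioned above, would be an alternative to assuming a fixed rate.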
The manuscript needs a format revision. For example, the first time that you introduce a species name, it needs to be used as the complete name (e.g., Saccharum officinarum). Then, it can be abbreviated using only the first letter of the genus (as S. officinarum). In line 182, the species S. officinarum is mistyped as S. officianrum and in line 210 as Sacchuarm. In some paragraphs, the authors use the species names (S. spontaneum and S. officinarum) while in other ones they use just an abbreviation (SS or SO). The same criteria should be used for the whole text. It also needs a consistency revision between the numbers presented in the text and the tables. For example, line 141 states that 257,534 protein-coding gene models were identified. Nevertheless, in table 1, this number goes to 370,103.
G. References: appropriate credit to previous work?
Yes, to some extent, but I found some references missing, like the one for the reference genome S. officinarum LA-purple.
H. Clarity and context: lucidity of abstract/summary, appropriateness of abstract, introduction and conclusions. - The abstract is adequate, and clear.
Reviewer #3: Remarks to the Author:
In the manuscript entitled "A chromosomal-scale genome assembly of modern cultivated hybrid sugarcane provides insights into origination and evolution," the authors describe a haplotype-resolved chromosome-scale assembly of the ZZ1 cultivar of sugarcane. The authors describe how they assembled the highly complex ZZ1 genome using long and short reads coupled with Hi-C. In the abstract the authors claimed they developed new methods ("…series of bioinformatics approaches we developed…"), but it was unclear from the description that any new/novel assembly techniques were used in the process. The authors then go on to look at homoeolog expression dominance and conclude there is none in ZZ1. In addition, the authors look at expression after treatment with the fusarium pest Pokkah boeng, and report that it may be a combination of the subgenomes that provides resistance (based on expression). The manuscript reads much like a genome announcement and only addresses the origin and evolution of the sugarcane genome through the comparison and expression of alleles. Since this is primarily a manuscript about an improved sugarcane genome assembly, the genome needs to be available to the reviewers to assess; it is impossible to review the content otherwise. Also, the authors do very little to no comparison with the sugarcane genome assemblies that are currently available.
Below are small comments concerning grammar content and writing.
If you are going to use "Rec" to represent the recombined chromosomes, then use it consistently throughout.
Please use ZZ1 or ZZ consistently throughout as the name of the cultivar.
Would be great to add a bit in the introduction about Pokkah boeng (and disease resistance in sugarcane).
Line 45 "….leading to a colossal genome size of ~10 Gb." It is large but not "colossal;" gymnosperms and above 40 Gb is colossal; this is just a large genome for a plant.
Lines 133-135: "The comprehensive approach was used to detect allele genes within the ultra-complex genome based on the monoploid genomes and annotated genes of two ancestors (see Method)." Sentence does not make sense.
In the section from lines 156-182, the authors use the word "decents;" do the authors mean descendants? It could also be used to mean descent (trajectory) but that does not fit well in the context of what is being said.
Line 240: "lineages showed dominant expression (LDE)" Does LDE stand for Lineage Dominant Expression? If so, then this sentence needs to be reconstructed to make that clear.

REVIEWER COMMENTS
Reviewer #1 (Remarks to the Author): As the first report of a phased chromosome level assembly of a complex hybrid sugarcane genome, this is an important contribution to the sugarcane community. The authors are to be congratulated for overcoming the technical challenges in assembling this complex polyploid genome.
Response: Thanks for your kind words. We appreciate your recognition of our work as the first report of a phased chromosome-level assembly of a complex hybrid sugarcane genome. We are grateful for the opportunity to contribute to the sugarcane community and overcome the technical challenges.
Two specific questions and comments: Lines 96/97: The text mentions 'artificial sequences' and 'artificial k-mers'. Does this refer to chimeric sequences generated during sequencing, i.e. artifacts? Is 'artificial' the best word to use here? Or artifactual sequences?
Response: The 'artificial sequences' are one type of assembly error derived from the primary assembled contigs, and they can be identified by k-mer analysis with the Merqury program. In our study, sequences in which a large proportion (> 40%) of k-mers were present only in the assembly but absent from the sequencing reads were identified as 'artificial sequences' and then removed.
Yes, the word "artifactual" seems more suitable. We have revised the resubmitted version.
Reviewer #2 (Remarks to the Author):
Overall evaluation
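The filtering criterion described in this response can be sketched as follows. This is a simplified illustration rather than the authors' pipeline or Merqury itself: it uses exact forward-strand k-mers held in Python sets, a tiny k for readability, and hypothetical function names; only the 40% threshold comes from the response above.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers in seq (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assembly_only_fraction(contig, read_kmers, k):
    """Fraction of a contig's distinct k-mers that never occur in the reads."""
    contig_kmers = kmers(contig, k)
    if not contig_kmers:
        return 0.0
    missing = sum(1 for km in contig_kmers if km not in read_kmers)
    return missing / len(contig_kmers)


def filter_contigs(contigs, reads, k=4, threshold=0.40):
    """Split contigs into kept/removed by their assembly-only k-mer fraction.

    A contig is removed when more than `threshold` of its k-mers are
    absent from the reads (0.40 mirrors the > 40% cutoff in the response).
    """
    read_kmers = set()
    for read in reads:
        read_kmers |= kmers(read, k)
    kept, removed = {}, {}
    for name, seq in contigs.items():
        frac = assembly_only_fraction(seq, read_kmers, k)
        (removed if frac > threshold else kept)[name] = frac
    return kept, removed


if __name__ == "__main__":
    # Toy data, for illustration only.
    reads = ["ACGTACGTAC"]
    contigs = {"supported": "ACGTACG", "unsupported": "TTTTTTTT"}
    kept, removed = filter_contigs(contigs, reads, k=4)
    print("kept:", kept)
    print("removed:", removed)
```

A production analysis would use canonical k-mers and a k-mer counting database instead of in-memory sets, which do not scale to a 10 Gb assembly.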

==================
This manuscript presents a new exciting genome assembly and analysis of the species Saccharum officinarum. This species has a quite complex genome and previous assemblies are incomplete. The latest advances in sequencing technologies have allowed the authors to produce an excellent genome assembly. Nevertheless, there are several points that the authors should resolve: 1-The lack of important details in the material and methods, which makes a full assessment difficult; 2-A deeper discussion on the results with a better contextualization and the inclusion of many articles about the complex sugarcane genome; 3-An extended analysis of the genome structure and evolution of this species; 4-Revision of the format of the manuscript including the coherence between text and tables. So overall, it is a manuscript worth reading. For sure, something publishable from which the community will benefit once the major concerns have been resolved.
Response: Thank you for your valuable feedback on our manuscript. We truly appreciate your kind words, as they are encouraging, especially considering the exhaustive process of assembling such a complex genome. Rest assured, we are fully dedicated to addressing the concerns you raised and making the necessary revisions to enhance the quality of the manuscript further. Your thoughtful evaluation is greatly appreciated, and we are grateful for your support throughout this process.
1-The lack of important details in the material and methods, which makes a full assessment difficult;
Response: Thanks for your comments. We added more details in the material and methods section. The supplemented content is as follows:
Line 461-479: Sample preparation and genome sequencing
RNA extraction, sequencing, and profiling. A spore suspension of Ssc (the causative agent of smut) (1×10⁶ spores/mL) was inoculated at the growth point of ZZ1, at the young roots and buds, respectively. Inoculated plants were placed in a constant temperature incubator at 28°C in a medium moisturizing culture. Smut group samples were collected at 5d and 20d after inoculation, with water as a control treatment. In addition, different tissues at different developmental stages were collected from ZZ1, including leaf in pre-mature stage (ZBL), stem in pre-mature stage (ZBS), leaf in mature stage (ZCL), stem in mature stage (ZCS), leaf in tillering stage (ZFL), stem in tillering stage (ZFS), leaf in seedling stage (ZYL), and stem in seedling stage (ZYS). The roots and buds of ZZ9 (the same parents as ZZ1) were collected at 0d, 1d, 2d, 3d, and 4d for each of the above treatments with three replicates. Total RNA was extracted from the above samples using the RNAprep Pure Plant Kit (Tiangen Biotech, Beijing, China) and subsequently used for cDNA library construction. The quality of the cDNA library was assessed on the Agilent Bioanalyzer 2100 system, and the libraries were sequenced on an Illumina NovaSeq platform. Raw reads were quality-controlled with Cutadapt, and the filtered reads were aligned to the ZZ1 genome using HISAT2 v2.1.0. Using Cufflinks, the expression levels of transcripts and genes were quantified from the positions of mapped reads on the genome. FPKM (fragments per kilobase of exon per million fragments mapped) was used as the measure of transcript expression level.
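For readers less familiar with the unit, the FPKM definition quoted above can be written out directly. The sketch below shows only the basic formula with a hypothetical function name; Cufflinks itself estimates abundances with a more elaborate statistical model, so this is not a reimplementation of the authors' quantification step.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped.

    FPKM = fragments * 1e9 / (exon_length_bp * total_mapped_fragments),
    i.e. fragments divided by (length in kb) and by (library size in millions).
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Hypothetical numbers: 500 fragments on a 2 kb transcript
    # in a library of 25 million mapped fragments.
    print(fpkm(500, 2000, 25_000_000))
```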
Line 680-691: Identification of homologous QTL for smut-resistance
We identified the QTL region for smut resistance on chromosome Sh06 in sugarcane variety R570, spanning approximately 7.74 Mb and containing 512 genes, and used it as the reference; the ZZ1 and ROC22 genomes were used as queries. JCVI (the Python version of MCScan) was employed to locate the chromosomes in ZZ1 and ROC22 that each had the maximum number of genes collinear with the QTL region for smut resistance. Subsequently, the identified chromosomes were used as queries, and JCVI was used to locate, on the homologous chromosomes of ZZ1 and ROC22, the regions most densely populated with genes collinear with the R570 QTL region; these were taken as the QTL regions for smut resistance. JCVI was then used to construct collinearity maps between the smut QTL region of R570 and those of ZZ1 and ROC22, and eggNOG-mapper was employed for functional annotation of genes within the QTL regions for smut resistance in all three varieties.
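The chromosome-selection step described above (choosing the chromosome with the most genes collinear with the R570 QTL region) amounts to a simple tally once collinear gene pairs are available. The sketch below assumes the pairs have already been parsed into Python tuples; the function name, gene identifiers, and chromosome labels are hypothetical, and no JCVI calls are shown.

```python
from collections import Counter


def best_collinear_chromosome(anchor_pairs, gene_to_chrom):
    """Return the target chromosome carrying the most collinear genes.

    anchor_pairs: iterable of (qtl_gene, target_gene) collinear pairs.
    gene_to_chrom: mapping from target gene ID to its chromosome.
    Returns (chromosome or None, Counter of per-chromosome counts).
    """
    counts = Counter()
    for _qtl_gene, target_gene in anchor_pairs:
        chrom = gene_to_chrom.get(target_gene)
        if chrom is not None:  # skip genes on unplaced sequences
            counts[chrom] += 1
    if not counts:
        return None, counts
    chrom, _ = counts.most_common(1)[0]
    return chrom, counts


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    pairs = [("q1", "t1"), ("q2", "t2"), ("q3", "t3"), ("q4", "t4")]
    chrom_of = {"t1": "ChrA", "t2": "ChrA", "t3": "ChrB", "t4": "ChrA"}
    print(best_collinear_chromosome(pairs, chrom_of))
```

The same tally applied to windows along the selected chromosome would give the "most densely populated" region, although the window size would be a further assumption.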
2-A deeper discussion on the results with a better contextualization and the inclusion of many articles about the complex sugarcane genome;
Response: We added more text to the "Discussion" section according to your suggestion. The supplemented text is as follows:
Line 339-358: Despite significant advancements in the haplotype-phased genomes of two ancestral Saccharum species, which have dramatically advanced the field of sugarcane genomics, they are still unrepresentative of the genomic information of modern cultivated sugarcane, which possesses many advantageous traits, such as the combination of high sugar content, superior abiotic stress resistance, and other exceptional traits. The resolution of these traits relies on the complete deciphering of high-quality modern cultivated sugarcane genomes. However, the modern cultivated sugarcane genome is one of the most intricate and challenging genomes worldwide. Over the past two decades, sugarcane genome research pioneers have expended tremendous efforts on the genome of sugarcane cultivars, yet progress has been limited. Unlike allo-polyploids such as wheat, cotton, and Brassica napus, modern cultivated sugarcane originated from crosses between auto-polyploid parents, followed by multiple rounds of backcrossing. Over a long history of breeding, the bloodlines of several parents within the Saccharum genus have been mixed, resulting in a homo(eo)aneuploid that is propagated asexually for commercial use. Approximately 10-20% of the chromosomes in its genome originated from recombination between the parents, and it possesses 8-14 homo(eo)logous copies of most genes. Therefore, the challenges in assembling the genome of modern cultivated sugarcane include (1) distinguishing between homozygous and heterozygous contigs and achieving chromosome-level scaffolding and (2) assembling and scaffolding chromosomes involving recombination events between the original parental lineages.
3-An extended analysis of the genome structure and evolution of this species;
Response: Thanks for your suggestions; we have supplemented the genome structure comparison between ZZ1 and other published sugarcane genomes.
---------------------------------
The manuscript titled "A chromosomal-scale genome assembly of modern cultivated hybrid sugarcane provides insights into origination and evolution" describes a new genome assembly and analysis of the complex polyploid genome of sugarcane (Saccharum officinarum) including the transcriptomic analysis of several developmental stages. The key results can be summarized as:
• Sequencing, assembly, and annotation of the sugarcane accession ZZ1. The assembly produced has a size of 10.4 Gb. 87% of the assembly was anchored into 114 chromosomes. 66.5% of the genome was annotated as repetitive elements. The genome annotation delivered 257,534 protein-coding gene models. The origin of 68,509 gene models was identified according to the three different sub-genomes.
• The cytogenetic characterization of the sugarcane accession ZZ1 described a genome of approx. 9 Gb and 114 chromosomes of which 68 chromosomes derived from S. officinarum, 31 from S. spontaneum, and 15 representing recombination between sub-genomes.
• The authors did not find sub-genome dominance for gene expression, although they found that 7.1% of the gene expression was biased to S. officinarum, and 3.0% to S. spontaneum.
• The pathogen response to Pokkah boeng disease is driven mainly by the S. officinarum sub-genome rather than the S. spontaneum sub-genome according to a transcriptomic analysis. Nevertheless, the non-overlapping function of the genes of each sub-genome indicates a possible cooperative response between them.
Response: Thanks for the professional summary!
B. Originality and significance: if not novel, please include reference
This manuscript presents a new exciting assembly of the complex sugarcane genome. There have been several genome assemblies of Saccharum spontaneum (Zhang et al., 2018, Zhang et al., 2021) where the origin of this species is discussed. There are also some fragmented genome assemblies for S. officinarum (Garsmeur et al., 2018, Souza et al., 2019), but the incompleteness of the assemblies limited the analysis.
Response: Thanks for your suggestion; we have supplemented the text in the Discussion, and the content is as follows:
Line 339-368: Despite significant advancements in the haplotype-phased genomes of two ancestral Saccharum species, which have dramatically advanced the field of sugarcane genomics, they are still unrepresentative of the genomic information of modern cultivated sugarcane, which possesses many advantageous traits, such as the combination of high sugar content, superior abiotic stress resistance, and other exceptional traits. The resolution of these traits relies on the complete deciphering of high-quality modern cultivated sugarcane genomes. However, the modern cultivated sugarcane genome is one of the most intricate and challenging genomes worldwide. Over the past two decades, sugarcane genome research pioneers have expended tremendous efforts on the genome of sugarcane cultivars, yet progress has been limited. Unlike allo-polyploids such as wheat, cotton, and Brassica napus, modern cultivated sugarcane originated from crosses between auto-polyploid parents, followed by multiple rounds of backcrossing. Over a long history of breeding, the bloodlines of several parents within the Saccharum genus have been mixed, resulting in a homo(eo)aneuploid that is propagated asexually for commercial use. Approximately 10-20% of the chromosomes in its genome originated from recombination between the parents, and it possesses 8-14 homo(eo)logous copies of most genes. Therefore, the challenges in assembling the genome of modern cultivated sugarcane include (1) distinguishing between homozygous and heterozygous contigs and achieving chromosome-level scaffolding and (2) assembling and scaffolding chromosomes involving recombination events between the original parental lineages.
To overcome the polyploidy and extensive recombination among ancestral subgenomes that characterize modern sugarcane hybrid genomes, we herein proposed a novel 'dimension reduction' assembly strategy, supplemented by a variety of sequencing technologies, to decipher the modern sugarcane hybrid 'ZZ1' genome completely. The quality of this ultra-complex genome, which benefited from innovations in assembly strategies and technological advances, is far superior to that of the previously published contig-level SP80-3280, draft chromosome-scale KK3, and mosaic R570 monoploid genomes. It not only provides a new perspective on the origin and evolution of modern sugarcane hybrids but also helps to analyze the molecular mechanisms of excellent traits, which is of great significance for future precision breeding of sugarcane.
Response: Thanks for your comments. We added more details in the material and methods section. The supplemented content is as follows:
Line 441-447 and 461-479: Illumina short reads sequencing. Genomic DNA was extracted using a QIAGEN DNeasy Plant Mini kit (Qiagen, Hilden, Germany) and subjected to library construction with an insert size of 300-500 bp. DNA quality was visually assessed using agarose gel electrophoresis (0.75%), and concentration was estimated using a spectrophotometer (Multiskan Sky Microplate 1510-01307C, Thermo Fisher Scientific, Massachusetts, USA). DNA libraries were sequenced on the Illumina NovaSeq platform in paired-end (PE) 150 bp mode.
RNA extraction, sequencing, and profiling. A spore suspension of Ssc (the causative agent of smut) (1×10⁶ spores/mL) was inoculated at the growth point of ZZ1, at the young roots and buds, respectively. Inoculated plants were placed in a constant temperature incubator at 28°C in a medium moisturizing culture. Smut group samples were collected at 5d and 20d after inoculation, with water as a control treatment. In addition, different tissues at different developmental stages were collected from ZZ1, including leaf in pre-mature stage (ZBL), stem in pre-mature stage (ZBS), leaf in mature stage (ZCL), stem in mature stage (ZCS), leaf in tillering stage (ZFL), stem in tillering stage (ZFS), leaf in seedling stage (ZYL), and stem in seedling stage (ZYS). The roots and buds of ZZ9 (the same parents as ZZ1) were collected at 0d, 1d, 2d, 3d, and 4d for each of the above treatments with three replicates. Total RNA was extracted from the above samples using the RNAprep Pure Plant Kit (Tiangen Biotech, Beijing, China) and subsequently used for cDNA library construction. The quality of the cDNA library was assessed on the Agilent Bioanalyzer 2100 system, and the libraries were sequenced on an Illumina NovaSeq platform. Raw reads were quality-controlled with Cutadapt, and the filtered reads were aligned to the ZZ1 genome using HISAT2 v2.1.0. Using Cufflinks, the expression levels of transcripts and genes were quantified from the positions of mapped reads on the genome. FPKM (fragments per kilobase of exon per million fragments mapped) was used as the measure of transcript expression level.
The Hi-C map is not very helpful to identify the possible pseudo-chromosomes. I recommend having in the supplementary a figure with the chromosomes with assigned bins and, because there are many of them, splitting the figure in 6-7 so the readers could inspect the results in more detail. In this sense, the authors should also report the number of contacts produced by the Hi-C map.
Response: Thanks for your suggestion. We supplemented Hi-C maps split into six parts based on the six pre-assigned groups (ROC-So, ROC-Ss, ROC-Rec, YZ-So, YZ-Ss, YZ-Rec) (Supplementary Figure 2); these were generated from a subset of the Hi-C reads so that HiC-Pro could be re-run in a short period of time. We identified a total of 77,138,022 interaction paired-end reads from the Hi-C map, which accounted for 67.54% of all valid Hi-C reads. Among these, there were 59,906,451 lib valid paired-end reads, representing 77.66% of all interaction paired-end reads. We added this description at Lines 122-123 of the revised manuscript.
The assessment of the quality of the assembly is partial. The authors have used Merqury but not reported the results about consensus quality and completeness (at least not in the result section). There is no assessment of the completeness of the LTR elements (e.g., using LTR_retriever). The contamination screening was performed only by Blast against bacterial genomes, but not, for example, fungi. The use of Blobtools is recommended in this case.
Response: Thanks for your valuable suggestion. We supplemented the Merqury completeness and quality value (QV) of the ZZ1 genome; please see Line 569-576. We identified 716,844 complete LTR elements, spanning a total length of 2.8 Gb. These account for 62.91% of all LTR transposable elements. We also calculated the LAI (LTR Assembly Index) to assess the genome assembly. The ZZ1 genome has an LAI value of 12.27, which qualifies it as a reference-quality genome (Line 140-142).
To address the issue of contaminating sequences and enhance the quality of the ZZ1 genome, we took several steps. Firstly, we downloaded sequences from the NCBI database, including bacterial genomes and plant organelle genomes, to establish a collection of potential contaminating sequences. Subsequently, we employed BLASTN to compare these sequences against the ZZ1 genome and filter out any contaminants. Additionally, we attempted to use the Blobtools process, as you recommended, for further contaminant filtering. To expedite the process, we divided the ZZ1 genome into 50 segments and ran the analysis in parallel. However, due to the substantial size of the dataset and genome, the entire process required several weeks and significant computational resources. Unfortunately, we were unable to complete the process within a reasonable timeframe and had to abandon this approach.
There are some parameters missing in the gene model annotation that could be useful to assess the quality such as the number of single exon gene, length of the gene models, numbers of UTR annotated, number of genes supported by RNA-Seq data… No description of the RNA-Seq analysis, including the number of replicates per sample.
Response: Thanks for your comments. We also supplemented the gene model annotation parameters as you suggested:
Line 156-160: A total of 370,103 protein-coding genes, spanning a combined length of 1235.86 Mb, were annotated by protein-homology-based and RNA-seq-alignment-based methods, of which 92.14% (341,040) could be successfully validated by RNA-seq reads. Out of all the annotated genes, 30.03% (111,167) consisted solely of a single exon, while 14.26% (52,797) and 15.75% (58,308) of genes had 5'UTR and 3'UTR annotations, respectively.
We supplemented the description of the RNA-Seq analysis as follows:
Line 465-479: Different tissues at different developmental stages were collected from ZZ1, including leaf in pre-mature stage (ZBL), stem in pre-mature stage (ZBS), leaf in mature stage (ZCL), stem in mature stage (ZCS), leaf in tillering stage (ZFL), stem in tillering stage (ZFS), leaf in seedling stage (ZYL), and stem in seedling stage (ZYS). The roots and buds of ZZ9 (the same parents as ZZ1) were collected at 0d, 1d, 2d, 3d, and 4d for each of the above treatments with three replicates. Total RNA was extracted from the above samples using the RNAprep Pure Plant Kit (Tiangen Biotech, Beijing, China) and subsequently used for cDNA library construction. The quality of the cDNA library was assessed on the Agilent Bioanalyzer 2100 system, and the libraries were sequenced on an Illumina NovaSeq platform. Raw reads were quality-controlled with Cutadapt, and the filtered reads were aligned to the ZZ1 genome using HISAT2 v2.1.0. Using Cufflinks, the expression levels of transcripts and genes were quantified from the positions of mapped reads on the genome. FPKM (fragments per kilobase of exon per million fragments mapped) was used as the measure of transcript expression level.
The quality of the genome assembly should be described using the standards of the Earth Biogenome Project (https://www.earthbiogenome.org/assembly-standards).
Response: Thanks for your comments. We have addressed this in the revised version.

E. Conclusions: robustness, validity, reliability
------------------------------------------------------
The conclusions are valid according to the results presented. However, the lack of details makes a full assessment difficult (for example, for the expression analysis, it needs to be assumed that the authors used three independent replicates of plants challenged by the pathogen and with clear phenotypes of the infection… none of these details are in the material and methods).
Response: Thanks for your comments. We added more details in the material and methods section. The supplemented content is as follows: Lines 461-479.
RNA extraction, sequencing, and profiling. Spore suspension of Ssc (the causative agent of smut) (1×10^6 spores/mL) was inoculated at the growth point of ZZ1, at the young roots and buds, respectively. Inoculated plants were placed in a constant temperature incubator at 28°C in a medium moisturizing culture. Smut group samples were collected at 5d and 20d after inoculation, with water as a control treatment. In addition, the different tissues and different developmental stages were collected from ZZ1, including leaf in pre-mature stage (ZBL), stem in pre-mature stage (ZBS), leaf in mature stage (ZCL), stem in mature stage (ZCS), leaf in tillering stage (ZFL), stem in tillering stage (ZFS), leaf in seedling stage (ZYL), and stem in seedling stage (ZYS). The roots and buds of ZZ9 (the same parents as ZZ1) were collected at 0d, 1d, 2d, 3d, and 4d for each of the above treatments with three replicates. Total RNA was extracted from the above samples using the RNAprep Pure Plant Kit (Tiangen Biotech, Beijing, China) and subsequently used for cDNA library construction. The quality of the cDNA library was assessed on the Agilent Bioanalyzer 2100 system, and the library was sequenced on an Illumina NovaSeq platform. The raw data were quality-controlled with Cutadapt, and the quality-controlled reads were aligned to the ZZ1 genome using HISAT2 v2.1.0. Using Cufflinks, the expression levels of transcripts and genes were quantified from the positions of mapped reads on the genome. FPKM (fragments per kilobase of exon per million fragments mapped) was used as the index of transcript expression level.
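To make the FPKM definition above concrete, here is a minimal sketch of the textbook formula. It is not the estimator Cufflinks actually runs (Cufflinks fits a statistical model over transcript isoforms); the function name and the example numbers are hypothetical and chosen only for illustration.

```python
def fpkm(fragments: float, exon_length_bp: float, total_mapped_fragments: float) -> float:
    """Textbook FPKM: fragments per kilobase of exon per million mapped fragments.

    FPKM = fragments * 1e9 / (exon_length_bp * total_mapped_fragments)
    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings.
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and total mapped fragments must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a transcript with 2,000 bp of exons,
# in a library with 20 million mapped fragments.
example = fpkm(500, 2_000, 20_000_000)
print(example)
```

Because the value is normalized by both transcript length and library size, it can be compared across genes of different lengths and across samples sequenced to different depths, which is why it suits the tissue and time-course comparisons described here.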
Lines 211-213: "Sugar transporter family: A total of 130 genes, including 1,166 alleles, likely belong to the members of the sugar transporter superfamily consisting of PLT, VGT, SFP, TMT, STP, pGlcT, INT, SUT, and SWEET subfamily." Spell out
Line 219: "…and their reconstruction in the leaf of the different development stage…" Not sure what this means; please clarify.
Lines 221-223: "In addition, the PLT, TMT, and SWEET families had more gene members than S. spontaneum, S. officinarum, and their relative species, suggesting that gene expansion occurred in these three families during the breeding of ZZ."
Lines 225-231: The section on NBS has very little content.
Lines 222/223: It seems that the inference here is that because the variety ZZ1 has more sugar transport gene members than the reference genomes of S. spontaneum (Clone AP85-441) and S. officinarum (LA Purple), gene expansion occurred during breeding. It is known, however, that at least 19 different S. officinarum clones and a range of S. spontaneum, S. barberi and S. sinense clones contribute to the genealogy of modern hybrids. It is likely that the additional gene members found in ZZ1 are derived from other ancestral clones of SO and SS, and not due to gene expansion during the several generations of breeding that led to variety ZZ1.
Response: Thank you for your perspective, which is indeed insightful. We fully agree with your viewpoint. The breeding process of modern cultivated sugarcane is complex, involving the incorporation of lineages from multiple clones. The genes associated with important agronomic traits are likely derived from the intermingling of various lineages. However, gene family expansion may also contribute to the increased number of gene family members, because the formation of hybrid species can trigger genetic evolutionary mechanisms, including transposon jumping, that lead to genome expansion (Ungerer et al., 2006; Ishikawa et al., 2009). Of course, more data are required to explore this further. Therefore, we have incorporated both viewpoints into the revised manuscript.
Ungerer MC, Strakosh SC, Zhen Y. Genome expansion in three hybrid sunflower species is associated with retrotransposon proliferation. Curr Biol. 2006 Oct 24;16(20):R872-R873.
Ishikawa R, Kinoshita T. Epigenetic programming: the challenge to species hybridization. Mol Plant. 2009 Jul;2(4):589-599.

One general comment relating to genome nomenclature and its implications for the sugarcane genomics community. It has been recognized for some time that the origins of modern sugarcane hybrids are S. spontaneum with x=8 and S. officinarum with x=10, and that structural rearrangements between the genomes of the two species will lead to issues in chromosome nomenclature. Without standardization, this will lead to confusion in the literature. For example, in this submission, the nomenclature of LA Purple was used (based on sorghum chromosome nomenclature), with the result that Chr5 and Chr8 are 'missing' for SS, while Chr9 and Chr10 are 'present' for SS, whereas the published genome for SS (Zhang et al., 2018) names SS chromosomes from 1 to 8, i.e. there is a lack of correspondence between the chromosome nomenclature of Zhang 2018 and this submission. The proposed solution to this problem was published by Garsmeur et al. 2018, by de-linking Saccharum chromosome nomenclature from Sorghum and creating a specific Saccharum nomenclature. The Garsmeur et al. 2018 proposal is summarized below. By adopting this nomenclature, future SS genomes can be named 1 to 8 and future hybrid genomes named 1 to 10, without confusion. It is strongly suggested that the authors consider adopting the nomenclature proposed by Garsmeur 2018 in order to facilitate and standardize sugarcane chromosome nomenclature. See attached .doc
Response: Thank you for your highly professional suggestions. Your advice holds great significance in advancing the field of sugarcane research. We are pleased to adopt the proposed nomenclature and have corrected the text and figures throughout the manuscript accordingly. Additionally, to facilitate a faster and more accurate understanding of the chromosomal correspondence between hybrid sugarcane and previously published works, including Garsmeur et al. 2018, Zhang et al. 2018, Zhang et al. 2022, and sorghum, we have included a chromosome numbering table as an attachment to the revised manuscript. The table is as follows:

Supplementary Table 1. Correspondence table of sugarcane chromosome nomenclature

Thank you!